arXiv: 1509.03186v1 [physics.ins-det] 10 Sep 2015


Fisher information vs. signal-to-noise ratio for a split detector 


George C. Knee and William J. Munro
NTT Basic Research Laboratories, NTT Corporation, 

3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan 
(Dated: September 11, 2015) 

We study the problem of estimating the magnitude of a Gaussian beam displacement using a two- 
pixel or ‘split’ detector. We calculate the maximum likelihood estimator, and compute its asymptotic 
mean-squared-error via the Fisher information. Although the signal-to-noise ratio is known to be 
simply related to the Fisher information under idealised detection, we find the two measures of 
precision differ markedly for a split detector. We show that a greater signal-to-noise ratio ‘before’ 
the detector leads to a greater information penalty, unless adaptive realignment is used. We find that 
with an initially balanced split detector, tuning the normalised difference in counts to 0.884753 ... 
gives the highest posterior Fisher information, and that this provides an improvement by at least 
a factor of about 2.5 over operating in the usual linear regime. We discuss the implications for 
weak-value amplification, a popular probabilistic signal amplification technique. 


I. INTRODUCTION 

Many historic experiments have concerned the detection of a spatial or angular deflection of a beam of particles or light: examples include Stern and Gerlach’s discovery of spin angular momentum [1], Young’s double slit experiment [2] and Germer and Davisson’s demonstration of electron diffraction [3]. This tradition continues in investigations of the spin-Hall effect of light [4] and other optical phenomena [5]. Often the effect being observed is very subtle, and the use of precision instrumentation is necessary to estimate its magnitude. Charge-coupled devices (CCDs) or CMOS active pixel sensor arrays are affordable solid-state technologies found in commercial cameras, which typically feature pixel counts in the tens of millions. On the other hand, the use of various detection systems in the field of quantum imaging, including a single pixel camera [6], has showcased the flexibility of few-pixel based detection methods when teamed with clever illumination strategies. In this work we consider how information is lost in one such imperfect detection process: the amount of information ‘coming out’ of a split detector versus the amount of information ‘going in’.

A split detector is a popular detection system in optics, microscopy, and imaging: it is composed only of a pair of photodiodes, arranged so that a lateral displacement of a beam of particles impinging upon the detector induces a modulation in the relative intensity of the photocurrents from each diode. Although here it is couched in optical terminology, the idea of split detection or ‘binned’ data applies to a wide range of experimental scenarios.

In this article, we study the ability of such a device to enable estimation of the magnitude of a beam displacement, when the lateral beam profile is described by a Gaussian function. In terms of data processing, we consider a simple linear estimator as well as the maximum likelihood estimator, and calculate two measures of precision: Fisher information and signal-to-noise. Our aim is to compare these quantities and their meaning.

In Sec. II we introduce our model of beam displacement, and calculate both the signal-to-noise ratio and Fisher information given ideal detection. In Sec. III we introduce the idea of pixelated or ‘binned’ data, and take the limit of poor resolution in Sec. IV, where our analysis of split detection begins in earnest. There we calculate the posterior Fisher information. This quantity describes the performance of the maximum likelihood estimator, which we derive in Sec. V. In Sec. VI we derive the signal-to-noise ratio of a split detector and show that it is not (in general) a good measure of precision. Sec. VII applies our results to weak-value amplification, which is a probabilistic method for increasing signal-to-noise. We draw several conclusions in Sec. VIII, including a short discussion on optimal use of split detectors.


II. BEAM DISPLACEMENT

Consider a beam of light, or particles, with a lateral intensity profile described by a Gaussian function centred on $x_0$ with standard deviation $\sigma$:

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - x_0)^2}{2\sigma^2}\right]. \tag{1}$$

The intensity could represent flux of quanta of energy, or simply the amplitude of the electromagnetic field; here we model it as a probability density function. A displacement of the beam is then simply written as $p'(x) = p(x - d)$; it might be generated (for example) by a magnetic field, an interference effect, or through coupling to an interface in the propagation medium. It will be convenient to consider $d = \lambda g$, where $\lambda$ is a fixed and known parameter and $g$ is an unknown parameter of interest, and the subject of a parameter estimation study.

If one is ignorant of the value of $g$, but permitted a large number of samples from $p(x - \lambda g)$, one may venture a guess, or estimate, of the true value. Knowledge of the functional form of $p'(x)$ allows for different philosophies to inform the estimate. Maximum likelihood estimation (MLE) is one of the most powerful estimation strategies. Under reasonable conditions [?], it provides an estimate $g_{\rm MLE}$ for the unknown parameter that i) is unbiased, $E(g_{\rm MLE}) = g$ ($E$ denotes the expectation value), and ii) is efficient, that is to say has a mean squared error that decays (when the number of trials $N \to \infty$) as $E((g_{\rm MLE} - g)^2) = \mathrm{Var}(g_{\rm MLE}) = 1/(FN)$. The proportionality constant $F$ is the Fisher information, which depends on $p'(x|g)$. The asymptotic mean squared error cannot be beaten by any other unbiased estimation strategy, and thus MLE is said to saturate the Cramer-Rao bound [9].

The likelihood of $g$ conditioned on $N$ independent samples $x_i$ is

$$\mathcal{L}(g|x_i) = \prod_{i=1}^{N} p'(x_i|g), \tag{2}$$

and the maximum likelihood estimator is defined through

$$\left.\frac{\partial \mathcal{L}}{\partial g}\right|_{g_{\rm MLE}} = 0; \tag{3}$$

one should also check that the second derivative is negative at this point to ensure a maximum and not a minimum. For a displaced Gaussian beam $p'(x|g) = p(x - \lambda g)$, it is easy to show that

$$g_{\rm MLE} = \sum_i \frac{x_i}{N\lambda} - \frac{x_0}{\lambda}. \tag{4}$$

The variance of this statistic can simply be propagated from the variance of each $x_i$ (note $x_0$ is a constant and has zero variance), and is given by $\mathrm{Var}(g_{\rm MLE}) = \sigma^2/(N\lambda^2)$. By the efficiency of MLE we can then infer the Fisher information. Alternatively, what we call the ‘prior’ Fisher information can be directly calculated:

$$F_x := \int \frac{[\partial_g p'(x)]^2}{p'(x)}\,dx = \frac{\lambda^2}{\sigma^2}. \tag{5}$$


The subscript $x$ denotes an assumption of infinite detector resolution. The Fisher information is the canonical measure of metrological performance. A higher Fisher information indicates the possibility of a lower variance in the estimate of an unknown quantity. Efficient estimation strategies such as MLE can reach the ultimate limit in precision [?].

We imagine $\lambda$, $\sigma$ and $x_0$ to be parameters that may or may not be under the control of the experimenter. In the case of an ideal detector (with infinite resolution), clearly the ratio of $\lambda$ to $\sigma$ should be made as high as possible. Note that alignment is only relevant for data processing (the estimator (4)): the expected performance (the Fisher information (5)) does not depend on $x_0$.
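
As a rough numerical illustration of Eqs. (4) and (5), the sketch below simulates ideal (infinite resolution) detection and compares the empirical variance of the estimator with the Cramer-Rao value $1/(NF_x)$. The function names, the parameter values and the use of Python's standard `random` module are assumptions made purely for illustration; they are not part of the original analysis.

```python
import random
import statistics

def mle_ideal(samples, lam, x0):
    """Eq. (4): mean recorded position, re-centred on x0 and scaled by lambda."""
    return (sum(samples) / len(samples) - x0) / lam

def fisher_prior(lam, sigma):
    """Eq. (5): prior Fisher information per detected particle."""
    return lam ** 2 / sigma ** 2

def simulate(g, lam, sigma, x0, n_samples, n_repeats, seed=0):
    """Repeat the experiment n_repeats times; return mean and variance of the estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_repeats):
        # Each sample is drawn from the displaced Gaussian p(x - lam * g).
        xs = [rng.gauss(x0 + lam * g, sigma) for _ in range(n_samples)]
        estimates.append(mle_ideal(xs, lam, x0))
    return statistics.mean(estimates), statistics.variance(estimates)

if __name__ == "__main__":
    g, lam, sigma, x0, n = 0.3, 2.0, 1.5, 0.7, 100  # illustrative values only
    mean_est, var_est = simulate(g, lam, sigma, x0, n, n_repeats=1000)
    print("mean estimate:", mean_est, "true g:", g)
    print("variance:", var_est, "Cramer-Rao value:", 1 / (n * fisher_prior(lam, sigma)))
```

Changing `x0` in this sketch shifts the raw data but should leave the variance of the estimates unchanged, in line with the remark that alignment only matters for data processing.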

An alternative metric often used to characterise the performance of beam displacement experiments [?], or precision measurements more generally, is the signal-to-noise ratio (SNR). We define the ‘prior’ SNR as

$$R_x := \frac{|\langle x\rangle - x_0|}{\sqrt{\langle x^2\rangle - \langle x\rangle^2}} = \frac{g\lambda}{\sigma}. \tag{6}$$


This quantity is defined as the magnitude of the ratio of the average displacement to the standard deviation (i.e. a function of the moments of $x$) [12]. In fact for the Gaussian model (1) considered here

$$R_x = g\sqrt{F_x}. \tag{7}$$

As we show in Sec. VI, $R_x$ is also the first order approximant (up to a factor of $\sqrt{2/\pi}$) to $R_n$, the (posterior) signal-to-noise ratio of a split detector [?].

For ideal detection, when comparing the performance under two different values of $\lambda$, the ‘gain’ for any fixed value of $g$ is the same whether one uses the square root of the prior Fisher information

$$\sqrt{\frac{F_x[p(x - \lambda_1 g)]}{F_x[p(x - \lambda_2 g)]}} = \frac{\lambda_1}{\lambda_2} \tag{8}$$

or if one uses the prior SNR

$$\frac{R_x[\lambda_1]}{R_x[\lambda_2]} = \frac{\lambda_1}{\lambda_2}. \tag{9}$$

A similar result is found for two values of the standard deviation $\sigma_1$ and $\sigma_2$. It can thus be tempting to use $R_x$, as it is far easier to work with. Caution must be employed, however, to ensure that nothing is being lost by reverting to the simpler figure of merit. As we shall see, when one uses a split detector, the two figures can behave very differently.


III. HIGH RESOLUTION DETECTION 

The maximum likelihood analysis above will fail to apply in a real situation, where some sort of pixelation is expected. The Fisher information under pixelation is surprisingly robust against reduction in the spatial resolution of the detector [?]. If we assume that the detector rounds the result of an ideal measurement to the nearest integer multiple of $r$ (the pixel width), this can be modelled by an appropriate transformation of the probability density function into a probability mass function

$$p(x) \to P_r(n) = \int_{(n-1/2)r}^{(n+1/2)r} p(x)\,dx. \tag{10}$$


Here $n$ is an integer that indexes the pixels. The likelihood is maximised with

$$g_{\rm MLE} \approx \sum_i \frac{n_i r}{N\lambda} - \frac{x_0}{\lambda}. \tag{11}$$

This has a similar form to the unpixelated case (4).

The shift $g\lambda + x_0$ can be split into an integer multiple of $r$ plus a remainder which becomes negligible as $r \to 0$. In this regime of high resolution, the shift in $x$ is a ‘shift in $n$’ and then $\mathrm{Var}(n_i) \approx \sigma^2/r^2 + 1/12$. The variance of the maximum likelihood estimator is $\mathrm{Var}(g_{\rm MLE}) \approx \frac{1}{N\lambda^2}\left(\sigma^2 + \frac{r^2}{12}\right)$, which is independent of $g$. The Fisher information penalty due to pixelation is the same for any $g$ and any $x_0$ as long as $r$ is small [?].
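
The $r^2/12$ correction can be checked directly from the probability mass function of Eq. (10). The sketch below sums the pixel probabilities for a Gaussian beam and computes the variance of the recorded position $n r$; the helper names, the truncation of the sum at a fixed number of standard deviations, and the example values are all assumptions chosen for illustration.

```python
import math

def pixel_pmf(n, r, mu, sigma):
    """Eq. (10): probability that a Gaussian sample (mean mu) rounds to pixel n of width r."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mu) / (math.sqrt(2) * sigma)))
    return cdf((n + 0.5) * r) - cdf((n - 0.5) * r)

def pixelated_variance(r, mu, sigma, span=12):
    """Variance of the recorded position n * r, summing pixels within span * sigma of mu."""
    n_lo = math.floor((mu - span * sigma) / r)
    n_hi = math.ceil((mu + span * sigma) / r)
    ns = range(n_lo, n_hi + 1)
    probs = [pixel_pmf(n, r, mu, sigma) for n in ns]
    mean = sum(p * n * r for p, n in zip(probs, ns))
    second = sum(p * (n * r) ** 2 for p, n in zip(probs, ns))
    return second - mean ** 2

if __name__ == "__main__":
    sigma, r = 1.0, 0.5  # illustrative values only
    for mu in (0.0, 0.2, 0.37):
        # mu plays the role of the beam centre x0 + g * lambda.
        print(mu, pixelated_variance(r, mu, sigma), sigma ** 2 + r ** 2 / 12)
```

For pixel widths comparable to or smaller than $\sigma$ the printed variance should be essentially independent of the beam centre, which is the sense in which the pixelation penalty does not depend on $g$ or $x_0$.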


IV. SPLIT DETECTION 


In the opposite limit to high resolution detection, we have split detection: when $r$ becomes large enough, only two pixels will be relevant. We imagine the split detector to consist of two pixels, one for positive and another for negative values of $x$. Such a detector reduces each value to its sign. We assume the pixels have infinite extent and carry an index $n \in \{-1, +1\}$. The appropriate transformation on the probability density function is therefore

$$p'(x) \to P(n) = \frac{1}{2}\left[1 + n\,\mathrm{erf}\left(\frac{x_0 + g\lambda}{\sqrt{2}\,\sigma}\right)\right]. \tag{12}$$

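One must use the discrete probability mass function definition of the Fisher information to reach what we call the ‘posterior’ Fisher information

$$F_n := \sum_n \frac{[\partial_g P(n)]^2}{P(n)} \tag{13}$$

$$= \frac{2\lambda^2}{\pi\sigma^2}\,\frac{\exp\left[-(x_0 + g\lambda)^2/\sigma^2\right]}{1 - \mathrm{erf}^2\left(\frac{x_0 + g\lambda}{\sqrt{2}\,\sigma}\right)}. \tag{14}$$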


Now if $\lambda$ and $\sigma$ are fixed, one should choose the alignment such that $x_0 = -g\lambda$ for best performance. Then one recovers $F_n = 2F_x/\pi$ [15]. Shifting the detector relative to the centroid of $p(x)$ in this way does not affect the prior information $F_x$, but it can mitigate the information penalty up to a factor of $2/\pi$. The problem is that such realignment depends on the unknown quantity $g$, and is very difficult to achieve. An adaptive technique should be possible, on the other hand, and one can use the goal of a symmetric distribution in the split detector to guide the alignment.
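
The posterior Fisher information can be evaluated numerically as a sanity check. The sketch below computes the sum in Eq. (13) with a finite-difference derivative of the click probabilities of Eq. (12), and compares it with the closed form obtained by carrying out that derivative analytically. The function names, the finite-difference step and the example values are assumptions for illustration only.

```python
import math

def p_split(n, g, lam, sigma, x0):
    """Eq. (12): probability of a click in pixel n = +1 or n = -1."""
    z = (x0 + g * lam) / (math.sqrt(2) * sigma)
    return 0.5 * (1 + n * math.erf(z))

def fisher_split_numeric(g, lam, sigma, x0, h=1e-5):
    """Eq. (13), with the derivative in g taken by central finite differences."""
    total = 0.0
    for n in (+1, -1):
        dp = (p_split(n, g + h, lam, sigma, x0) - p_split(n, g - h, lam, sigma, x0)) / (2 * h)
        total += dp ** 2 / p_split(n, g, lam, sigma, x0)
    return total

def fisher_split_closed(g, lam, sigma, x0):
    """Closed form of the same sum: (2 lam^2 / (pi sigma^2)) exp(-2 z^2) / (1 - erf(z)^2)."""
    z = (x0 + g * lam) / (math.sqrt(2) * sigma)
    return (2 * lam ** 2 / (math.pi * sigma ** 2)) * math.exp(-2 * z ** 2) / (1 - math.erf(z) ** 2)

if __name__ == "__main__":
    g, lam, sigma = 0.4, 1.3, 0.9  # illustrative values only
    prior = lam ** 2 / sigma ** 2
    for x0 in (0.0, -g * lam):
        # x0 = -g * lam is the ideally realigned detector, where F_n / F_x = 2 / pi.
        print(x0, fisher_split_numeric(g, lam, sigma, x0) / prior, 2 / math.pi)
```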

Another approach is to fix $x_0 = 0$; i.e. perfectly balance the detector before the unknown shift is introduced [16]. Then one has

$$g^2 F_n = \frac{2}{\pi}\,\frac{R_x^2\,e^{-R_x^2}}{1 - \mathrm{erf}^2\left(R_x/\sqrt{2}\right)}. \tag{15}$$


The dependence of $g^2F_x$ and $g^2F_n$ on $R_x$ is shown in Figure 1. There are two competing behaviours as $R_x$ is increased: the prior information increases, but the information penalty of the split detector becomes more severe (the centroid of the Gaussian becomes further from the pixel boundary). Let us consider the Fisher information penalty or ‘transmittance’ due to the split detector when $x_0 = 0$:

$$\eta_F := \frac{F_n}{F_x} = \frac{2}{\pi}\,\frac{e^{-R_x^2}}{1 - \mathrm{erf}^2\left(R_x/\sqrt{2}\right)}. \tag{16}$$


Notice that this function is decreasing in $R_x$. It attains the upper bound of $2/\pi$ in the limit $R_x \to 0$. Figure 1 shows that as the prior SNR increases, to first order the transmission of information is constant. Around $R_x = 1$ the transmittance $\eta_F$ decreases linearly in $R_x$. These results are in fact entirely intuitive. The information about the shift must tend toward zero as $R_x \to \infty$, because in the limit of a large prior SNR, every click lands in the right pixel (say), and any measure on the set of possible values of $g$ becomes infinite, and it becomes impossible to make a nontrivial estimate. The Fisher information transmittance $\eta_F$ captures this mathematically. By contrast, attempting to define a signal-to-noise transmittance $\eta_R := R_n/R_x$ leads to a quantity that exceeds unity, as shown in Figure 1.

By inspection of (15), when one has control over $R_x$ (i.e. over $\lambda$ or $\sigma$ or both) there will be an optimum choice to maximise $g^2F_n$ given split detection. The optimum value is $R_x^* = 1.57504\ldots$ [17]. This optimum point, well outside of the linear regime of the detector, sets a maximum achievable Fisher information of $F_n^* \approx 0.60842/g^2$. This is in contrast with high resolution detection where the information a) is independent of $g$ and b) can be freely increased by scaling $\lambda$ or $\sigma$. Note that adaptive realignment of a split detector effectively reproduces the high resolution situation, up to the factor of $2/\pi$.
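
The optimum operating point can be located numerically. The sketch below maximises the scaled posterior Fisher information of a balanced ($x_0 = 0$) split detector over the prior SNR using a golden-section search; the choice of search method, the bracketing interval and the function names are assumptions made for illustration, and any one-dimensional optimiser would serve equally well.

```python
import math

def scaled_fisher(R):
    """g^2 F_n for a balanced (x0 = 0) split detector as a function of the prior SNR R."""
    return (2 / math.pi) * R ** 2 * math.exp(-R ** 2) / (1 - math.erf(R / math.sqrt(2)) ** 2)

def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a single-peaked function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

if __name__ == "__main__":
    R_opt = golden_max(scaled_fisher, 0.1, 4.0)  # bracketing interval is an assumption
    print("optimum prior SNR:", R_opt)
    print("scaled Fisher information at optimum:", scaled_fisher(R_opt))
    print("expected normalised count difference:", math.erf(R_opt / math.sqrt(2)))
```

The last printed quantity is the expected value of $(N_+ - N_-)/(N_+ + N_-)$ at the optimum, which is how the optimum can be recognised in recorded data.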


V. SATURATING THE CRAMER-RAO 
BOUND 


Maximum likelihood estimation is an efficient technique for saturating the Cramer-Rao bound. Often the estimation procedure is complicated enough to necessitate a numerical approach. For certain scenarios, however, an analytic closed form expression for the estimator can be calculated. This makes the computation of the estimate very easy, and can also provide insights into experiment design. For split detection, the likelihood function to be maximized is

$$\mathcal{L}(g|n_i) = \frac{(N_+ + N_-)!}{N_+!\,N_-!}\,P(+1)^{N_+}\,P(-1)^{N_-}, \tag{17}$$

FIG. 1. (Color online) a) Prior quantities: the signal-to-noise ratio $R_x$ (dashed, blue) and the scaled Fisher information $g^2F_x$ (solid, green) as a function of prior signal-to-noise $R_x$, assuming an ideal detector (note their monotonic relation to one another); b) Information penalty or ‘transmittance’ $\eta_F$ of Fisher information (solid, green) or $\eta_R$ of signal-to-noise (dashed, blue) at the split detector when $x_0 = 0$; c) Posterior quantities: the signal-to-noise ratio $R_n$ (dashed, blue) and scaled Fisher information $g^2F_n$ (solid, green) relating to a well aligned ($x_0 = 0$) split detector. The measures are no longer monotonically related. All quantities are dimensionless.


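where $N_\pm$ is the number of clicks in each pixel. Now

$$0 = \left.(\partial_g \log \mathcal{L})\right|_{g_{\rm MLE}} \tag{18}$$

$$= \left.\partial_g\left\{N_+ \log\left[\frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\left(\frac{x_0 + g\lambda}{\sqrt{2}\,\sigma}\right)\right] + N_- \log\left[\frac{1}{2} - \frac{1}{2}\,\mathrm{erf}\left(\frac{x_0 + g\lambda}{\sqrt{2}\,\sigma}\right)\right]\right\}\right|_{g_{\rm MLE}} \tag{19}$$

$$\propto \frac{N_+}{1 + \mathrm{erf}\left(\frac{x_0 + g_{\rm MLE}\lambda}{\sqrt{2}\,\sigma}\right)} - \frac{N_-}{1 - \mathrm{erf}\left(\frac{x_0 + g_{\rm MLE}\lambda}{\sqrt{2}\,\sigma}\right)}. \tag{20}$$

We divided by $e^{-(x_0 + g\lambda)^2/(2\sigma^2)}$, a factor which tends to zero when $g \to \pm\infty$, extrema corresponding to the minimum likelihood. We also discarded the binomial factor, which became an additive constant under the action of the logarithm. Proceeding, one arrives at the maximum likelihood estimator

$$g_{\rm MLE} = \frac{\sqrt{2}\,\sigma}{\lambda}\,\mathrm{erf}^{-1}\left(\frac{N_+ - N_-}{N_+ + N_-}\right) - \frac{x_0}{\lambda}. \tag{21}$$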

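Note the difference to the high resolution estimator (11): here we must take account of $\sigma$. The maximum likelihood estimator is approximated, when the argument of the inverse error function is small, by

$$g_{\rm MLE} \approx \frac{\sqrt{\pi}\,\sigma}{\sqrt{2}\,\lambda}\left(\frac{N_+ - N_-}{N_+ + N_-}\right) - \frac{x_0}{\lambda}. \tag{24}$$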
In fact this approximation is good until the argument approaches around one half. The departure of the maximum-likelihood estimate from a simple linear estimate is shown in Figure 2. While the simple linear estimator becomes biased, the MLE continues to provide an unbiased estimate in the nonlinear regime, although of course performance is impacted. Interestingly, when $x_0 = 0$, the optimum value for $(N_+ - N_-)/(N_+ + N_-)$ is $0.884753\ldots$, which follows from the optimum $R_x^*$.
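
The two estimators can be compared on idealised data. The sketch below implements the closed-form maximum likelihood estimator and its linear approximation, obtaining the inverse error function from the normal quantile function in Python's `statistics` module (an implementation choice assumed here, since the standard `math` module has no inverse error function). The helper `expected_counts` and the example values are hypothetical and only serve to produce noise-free counts.

```python
import math
from statistics import NormalDist

def erfinv(y):
    """Inverse error function via the standard normal quantile; requires -1 < y < 1."""
    return NormalDist().inv_cdf((1 + y) / 2) / math.sqrt(2)

def split_mle(n_plus, n_minus, lam, sigma, x0=0.0):
    """Maximum likelihood estimate of g from the two pixel counts."""
    y = (n_plus - n_minus) / (n_plus + n_minus)
    return math.sqrt(2) * sigma / lam * erfinv(y) - x0 / lam

def split_linear(n_plus, n_minus, lam, sigma, x0=0.0):
    """First-order (linear) approximation to the estimator above."""
    y = (n_plus - n_minus) / (n_plus + n_minus)
    return math.sqrt(math.pi / 2) * sigma / lam * y - x0 / lam

def expected_counts(g, lam, sigma, x0, n_total):
    """Noise-free (expected) counts in the +1 and -1 pixels, for illustration."""
    p_plus = 0.5 * (1 + math.erf((x0 + g * lam) / (math.sqrt(2) * sigma)))
    return n_total * p_plus, n_total * (1 - p_plus)

if __name__ == "__main__":
    lam, sigma = 1.5, 1.0  # illustrative values only
    for g in (0.05, 0.3, 1.0):
        n_plus, n_minus = expected_counts(g, lam, sigma, 0.0, 1000)
        print(g, split_mle(n_plus, n_minus, lam, sigma), split_linear(n_plus, n_minus, lam, sigma))
```

If every click lands in one pixel the normalised difference is exactly $\pm 1$ and `erfinv` raises an error, which mirrors the statement that no nontrivial estimate is possible in that limit.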

VI. SPLIT DETECTION SNR 

As is manifest from Figure 1, $R_x$ breaks down as a useful measure of precision for split detectors when it becomes too large. In this section we investigate whether a faithful application of the concept of signal-to-noise ratio to the split detector,

$$R_n := \frac{|\langle n\rangle|}{\sqrt{\langle n^2\rangle - \langle n\rangle^2}}, \tag{25}$$

can provide a more useful measure. When $x_0 = 0$ we have

$$R_n = \frac{\mathrm{erf}\left(R_x/\sqrt{2}\right)}{\sqrt{1 - \mathrm{erf}^2\left(R_x/\sqrt{2}\right)}}. \tag{26}$$

When $R_x \ll \sqrt{2}$, we have

$$R_n \approx \sqrt{\frac{2}{\pi}}\,R_x, \tag{27}$$

but otherwise $R_n$ is not monotonically related to $F_n$. The differences between $R_x$ and $F_n$ are therefore not attributable to the failure of linearity in $R_x$, because the difference in trend between $R_n$ and $F_n$ is even more pronounced (see Figure 1) away from the linear regime. Since $R_n \to \infty$ as $R_x \to \infty$, it is not advisable to use $R_n$ as a measure of precision. Even when there is a vanishing amount of information about $g$ available, $R_n$ reports an arbitrarily high signal-to-noise ratio.

This serves as a warning against using $R_n$ or $R_x$ as figures of merit, unless $R_x \ll \sqrt{2}$. When that condition is satisfied, the SNR is a good figure of merit and qualitatively captures the performance as measured by the Fisher information. Otherwise the Fisher information should be preferred because it takes the nonlinear relationship between what is measured ($n$) and what is estimated ($g$) into account.
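
The divergence between the two figures of merit is easy to tabulate. The sketch below evaluates the posterior SNR of a balanced split detector, using $\langle n\rangle = \mathrm{erf}(R_x/\sqrt{2})$ and $\langle n^2\rangle = 1$ for the two-valued outcome $n = \pm 1$, alongside the scaled posterior Fisher information. The function names and the sampled values of $R_x$ are assumptions for illustration.

```python
import math

def snr_split(R):
    """Posterior SNR R_n of a balanced split detector, as a function of the prior SNR R."""
    mean_n = math.erf(R / math.sqrt(2))      # <n>, since n takes the values +1 and -1
    return abs(mean_n) / math.sqrt(1 - mean_n ** 2)

def scaled_fisher_split(R):
    """Scaled posterior Fisher information g^2 F_n of the same detector."""
    return (2 / math.pi) * R ** 2 * math.exp(-R ** 2) / (1 - math.erf(R / math.sqrt(2)) ** 2)

if __name__ == "__main__":
    print("R_x   R_n   R_n/R_x   g^2 F_n")
    for R in (0.01, 0.5, 1.0, 1.575, 2.0, 3.0):
        print(R, snr_split(R), snr_split(R) / R, scaled_fisher_split(R))
```

In the printed table the posterior SNR keeps growing with $R_x$, while the Fisher information peaks and then falls, which is the behaviour the warning above refers to.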


VII. APPLICATION: WEAK-VALUE 
AMPLIFICATION 

Weak-value amplification (WVA) is a probabilistic signal amplification method which operates via quantum interference. Its discovery arose from a time symmetric approach to quantum theory [18]. By weakly coupling two degrees of freedom, usually named ‘system’ and ‘meter’, and then postselecting on an unlikely outcome of a subsequent strong measurement of the system, an unexpected change is seen in the meter [?]. The meter evolves as if it interacted with a ‘system’ of much higher energy.

The split detector is the detector of choice for WVA experiments in optical systems [?, 20], where the transverse state of a light beam acts as the ‘meter’ and records (for example) the polarisation or which-path state of the beam. By employing a weak-value protocol, the beam is shifted much further than usual, albeit probabilistically.

Moreover, the signal-to-noise ratio $R_x$ is usually preferred in these experiments for its being an ‘intuitive concept’ [21], leading to practical estimation strategies [22]. In fact this situation has been considered by Strübi and Bruder [23], who calculated an approximation to $F_n$ in the linear regime. By contrast we imagine that MLE could be employed to unlock the non-linear regime of the split detector, and to guarantee unbiased estimates. As we have shown, the Fisher information $F_n$ is then not simply related to $R_x$.

FIG. 2. (Color online) Dependence of two estimators on the observed normalised difference in counts at a well aligned ($x_0 = 0$) split detector. The blue, dashed line is proportional to the simple linear estimator, which can become significantly biased when the difference in counts is outside of the linear regime (shaded area). It is a first order approximant to the solid, orange curve, which is proportional to the maximum likelihood estimator. The latter is always unbiased and has unbeatable performance (mean squared error) when the number of trials is large. The optimum normalised difference in counts $0.884753\ldots$ balances the prior Fisher information with the information penalty $\eta_F$ to achieve the optimum posterior Fisher information $F_n^*$. This can be achieved by modulating $\lambda$ or $\sigma$.

To allow for a discussion of WVA we imagine an ancillary ‘system’ prepared in an eigenstate of a quantum mechanical observable $\hat{A}$ with eigenvalue $\lambda$, coupled impulsively to the transverse momentum of our beam with Hamiltonian $H = g\hat{A}k_x$. Since $k_x$ describes the transverse momentum of the beam, and is the generator of translations in $x$, this leads to $p(x - g\lambda)$ as above. Here we assume the initial state is such that the largest magnitude eigenvalue $\lambda_*$ is selected. When the coupling is weak ($g \approx 0$), however, and the ancillary system is prepared in an arbitrary state $|i\rangle$, impulsively coupled and then found to be in the final state $|f\rangle$, one substitutes the largest eigenvalue $\lambda_*$ with

$$A_w = \frac{\langle f|\hat{A}|i\rangle}{\langle f|i\rangle}, \tag{28}$$


the ‘weak value’ [?] (here taken to be a real number). This quantity can become much larger in magnitude than $\lambda_*$ by tuning $|i\rangle$ and $|f\rangle$. Crucially, however, there is no associated increase in the net Fisher information $NF_x$. This is because the technique only succeeds with a typically small probability $q = |\langle f|i\rangle|^2$, leading to only $qN$ successful runs. The reader is referred to Ref. [?] for further details.

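The square of the ‘gain’ serves as a measure of the relative performance of weak-value amplification. We will once more set $x_0 = 0$. Now

$$\frac{qNF_n[A_w]}{NF_n[\lambda_*]} = \frac{qF_x[A_w]\,\eta_F(gA_w/\sigma)}{F_x[\lambda_*]\,\eta_F(g\lambda_*/\sigma)} \tag{29}$$

$$= \frac{qA_w^2}{\lambda_*^2}\,\frac{\eta_F(gA_w/\sigma)}{\eta_F(g\lambda_*/\sigma)}. \tag{30}$$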


The first factor can be taken towards unity, but never exceeds it [?]. The second factor is the ratio of transmittances, or Fisher information penalties, due to the split detector. It is to first order unity, but in general decreasing in $A_w/\lambda_*$. Either $A_w = \lambda_*$ or $A_w \gg \lambda_*$ is necessary for the first factor to approach unity [?]. In the first case the weak-value technique reduces to the largest-eigenvalue technique and no amplification is seen. In the second case there is a large amplification, but this is sufficient to ensure the second factor is less than unity. So real WVA is strictly worse than a maximum eigenvalue method with a split detector; this conclusion is consistent with Strübi and Bruder’s claim that one should operate away from the regime of large amplification [23].
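
A toy calculation makes the two factors concrete. The sketch below assumes a hypothetical two-level ancilla with observable $\mathrm{diag}(+1, -1)$ (so that the largest-magnitude eigenvalue is 1) and real pre- and post-selected states parametrised by angles; none of these choices comes from the original text, and they are made only so that the weak value and the post-selection probability can be written in closed form. The product is then split into the factor $qA_w^2/\lambda_*^2$ and the ratio of split detector transmittances.

```python
import math

def eta_F(R):
    """Fisher information transmittance of a balanced split detector at prior SNR R."""
    return (2 / math.pi) * math.exp(-R ** 2) / (1 - math.erf(R / math.sqrt(2)) ** 2)

def qubit_weak_value(theta_i, theta_f):
    """Weak value and post-selection probability for a hypothetical two-level ancilla.

    |i> = (cos theta_i, sin theta_i), |f> = (cos theta_f, sin theta_f),
    and the observable is diag(+1, -1).
    """
    overlap = math.cos(theta_i - theta_f)         # <f|i>
    matrix_element = math.cos(theta_i + theta_f)  # <f|A|i>
    return matrix_element / overlap, overlap ** 2

def relative_performance(A_w, q, lam_star, g, sigma):
    """Return the two factors: q A_w^2 / lam_star^2 and the ratio of transmittances."""
    first = q * A_w ** 2 / lam_star ** 2
    second = eta_F(g * A_w / sigma) / eta_F(g * lam_star / sigma)
    return first, second

if __name__ == "__main__":
    g, sigma, lam_star = 0.1, 1.0, 1.0  # illustrative values only
    for theta in (0.3, 0.6, 0.7):
        A_w, q = qubit_weak_value(theta, -theta)
        first, second = relative_performance(A_w, q, lam_star, g, sigma)
        print("A_w = %.3f, q = %.4f, first = %.3f, second = %.3f" % (A_w, q, first, second))
```

With the symmetric choice of angles used in the loop the first factor equals one for every amplification, so any shortfall in the printed product comes entirely from the split detector transmittance.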












VIII. DISCUSSION 

We have made a formal comparison of Fisher information and signal-to-noise before and after a split detector. Considering maximum-likelihood estimation as an unbiased and efficient estimation strategy, we found that the signal-to-noise ratio becomes misleading as a measure of precision outside of the linear response regime of the detector. Outside of this regime, simple linear estimation becomes biased. Both of these problems can be overcome by using MLE, which is unbiased across the entire regime of the split detector and efficient (meaning its mean squared error is determined by $F_n$).

Furthermore we discussed ways to optimize the use of split detectors. Firstly, adaptive realignment of the detector (taking $x_0 \to -\lambda g$) is a challenging but rewarding technique which can in principle recover the ideal Fisher information $F_x$ up to a multiplicative constant of $2/\pi$. A more realistic technique is to balance the detector when there is no signal, $x_0 = 0$: then one can achieve higher performance (i.e. an unbiased estimate with a lower mean squared error) with the same number of photons by modulating $\lambda/\sigma$ such that the detector is operating at an optimum point $R_x^*$ outside the linear regime. In optics this can be achieved by changing the spot size of the beam. In so doing, instead of operating close to the limit of the linear response region, where $N_+ - N_- = 0.5(N_+ + N_-)$ and $R_x \approx 0.675$, one can make $R_x \approx 1.57504$ and so increase the Fisher information by a factor of approximately 2.48. The advantage over operating deep within the linear regime (instead of at the limit) will be even higher.
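
The size of this improvement can be reproduced with a few lines of arithmetic. The sketch below evaluates the scaled posterior Fisher information of a balanced split detector at the edge of the linear regime and at the quoted optimum, and takes their ratio. Obtaining the prior SNR from a target normalised count difference via the normal quantile function is an implementation choice assumed here; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def scaled_fisher(R):
    """g^2 F_n for a balanced (x0 = 0) split detector at prior SNR R."""
    return (2 / math.pi) * R ** 2 * math.exp(-R ** 2) / (1 - math.erf(R / math.sqrt(2)) ** 2)

def prior_snr_for_count_difference(y):
    """Prior SNR at which the expected normalised count difference erf(R / sqrt(2)) equals y."""
    # erf(R / sqrt(2)) = y is equivalent to Phi(R) = (1 + y) / 2 for the normal CDF Phi.
    return NormalDist().inv_cdf((1 + y) / 2)

if __name__ == "__main__":
    R_linear = prior_snr_for_count_difference(0.5)  # edge of the linear regime
    R_optimum = 1.57504                             # optimum quoted in the text
    print("R_x at the edge of the linear regime:", R_linear)
    print("improvement factor:", scaled_fisher(R_optimum) / scaled_fisher(R_linear))
```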

We also showed that in the absence of any other technical problems, the use of split detection implies that (real) weak-value amplification has a strictly worse performance (lower posterior Fisher information) than a standard technique. This is similar to the problem of nonlinear bias in the weak-value linear estimator [?], in that it can be reduced by controlling $\sigma \to \infty$ (equivalently when $g \to 0$) so that the performance of the two strategies approaches parity.